{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "view-in-github"
      },
      "source": [
        "\u003ca href=\"https://colab.research.google.com/github/tensorflow/tpu/blob/master/tools/colab/shakespeare_with_tpu_and_keras.ipynb\" target=\"_parent\"\u003e\u003cimg src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/\u003e\u003c/a\u003e"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "N6ZDpd9XzFeN"
      },
      "source": [
        "##### Copyright 2018 The TensorFlow Hub Authors.\n",
        "\n",
        "Licensed under the Apache License, Version 2.0 (the \"License\");"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "cellView": "form",
        "colab": {},
        "colab_type": "code",
        "id": "KUu4vOt5zI9d"
      },
      "outputs": [],
      "source": [
        "# Copyright 2018 The TensorFlow Hub Authors. All Rights Reserved.\n",
        "#\n",
        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "#     http://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License.\n",
        "# =============================================================================="
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "innBbve1LdjE"
      },
      "outputs": [],
      "source": [
        ""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "edfbxDDh2AEs"
      },
      "source": [
        "## Predict Shakespeare with Cloud TPUs and Keras"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "RNo1Vfghpa8j"
      },
      "source": [
        "## Overview\n",
        "\n",
        "This example uses [tf.keras](https://www.tensorflow.org/guide/keras) to build a *language model* and train it on a Cloud TPU. This language model predicts the next character of text given the text so far. The trained model can generate new snippets of text that read in a similar style to the text training data.\n",
        "\n",
        "The model trains for 10 epochs and completes in approximately 5 minutes.\n",
        "\n",
        "This notebook is hosted on GitHub. To view it in its original repository, after opening the notebook, select **File \u003e View on GitHub**."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "dgAHfQtuhddd"
      },
      "source": [
        "## Learning objectives\n",
        "\n",
        "In this Colab, you will learn how to:\n",
        "*   Build a two-layer, forward-LSTM model.\n",
        "*   Use distribution strategy to produce a `tf.keras` model that runs on TPU version and then use the standard Keras methods to train: `fit`, `predict`, and `evaluate`.\n",
        "*   Use the trained model to make predictions and generate your own Shakespeare-esque play.\n",
        "\n",
        "\n",
        "\n",
        "\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "QrprJD-R-410"
      },
      "source": [
        "## Instructions"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "_I0RdnOSkNmi"
      },
      "source": [
        "\u003ch3\u003e  \u0026nbsp;\u0026nbsp;Train on TPU\u0026nbsp;\u0026nbsp; \u003ca href=\"https://cloud.google.com/tpu/\"\u003e\u003cimg valign=\"middle\" src=\"https://raw.githubusercontent.com/GoogleCloudPlatform/tensorflow-without-a-phd/master/tensorflow-rl-pong/images/tpu-hexagon.png\" width=\"50\"\u003e\u003c/a\u003e\u003c/h3\u003e\n",
        "\n",
        "   1. On the main menu, click Runtime and select **Change runtime type**. Set \"TPU\" as the hardware accelerator.\n",
        "   1. Click Runtime again and select **Runtime \u003e Run All**. You can also run the cells manually with Shift-ENTER. "
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "kYxeFuKCUx9d"
      },
      "source": [
        "TPUs are located in Google Cloud, for optimal performance, they read data directly from Google Cloud Storage (GCS)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "Lvo0t7XVIkWZ"
      },
      "source": [
        "## Data, model, and training"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "xzpUtDMqmA-x"
      },
      "source": [
        "In this example, you train the model on the combined works of William Shakespeare, then use the model to compose a play in the style of *The Great Bard*:\n",
        "\n",
        "\u003cblockquote\u003e\n",
        "Loves that led me no dumbs lack her Berjoy's face with her to-day.  \n",
        "The spirits roar'd; which shames which within his powers  \n",
        "\tWhich tied up remedies lending with occasion,  \n",
        "A loud and Lancaster, stabb'd in me  \n",
        "\tUpon my sword for ever: 'Agripo'er, his days let me free.  \n",
        "\tStop it of that word, be so: at Lear,  \n",
        "\tWhen I did profess the hour-stranger for my life,  \n",
        "\tWhen I did sink to be cried how for aught;  \n",
        "\tSome beds which seeks chaste senses prove burning;  \n",
        "But he perforces seen in her eyes so fast;  \n",
        "And _  \n",
        "\u003c/blockquote\u003e\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "KRQ6Fjra3Ruq"
      },
      "source": [
        "### Download data\n",
        "\n",
        "Download *The Complete Works of William Shakespeare* as a single text file from [Project Gutenberg](https://www.gutenberg.org/). You use snippets from this file as the *training data* for the model. The *target* snippet is offset by one character."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "j8sIXh1DEDDd"
      },
      "outputs": [],
      "source": [
        "!wget --show-progress --continue -O /content/shakespeare.txt http://www.gutenberg.org/files/100/100-0.txt"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "AbL6cqCl7hnt"
      },
      "source": [
        "### Build the input dataset"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Z7nbGKAHi0dx",
        "colab_type": "text"
      },
      "source": [
        "We just downloaded some text. The following shows the start of the text and a random snippet so we can get a feel for the whole text."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "aJiYai-GjQRk",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "!head -n5 /content/shakespeare.txt\n",
        "!echo \"...\"\n",
        "!shuf -n5 /content/shakespeare.txt"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "E3V4V-Jxmuv3"
      },
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "import tensorflow as tf\n",
        "import os\n",
        "\n",
        "import distutils\n",
        "if distutils.version.LooseVersion(tf.__version__) < '1.14':\n",
        "    raise Exception('This notebook is compatible with TensorFlow 1.14 or higher, for TensorFlow 1.13 or lower please use the previous version at https://github.com/tensorflow/tpu/blob/r1.13/tools/colab/shakespeare_with_tpu_and_keras.ipynb')\n",
        "\n",
        "# This address identifies the TPU we'll use when configuring TensorFlow.\n",
        "TPU_WORKER = 'grpc://' + os.environ['COLAB_TPU_ADDR']\n",
        "\n",
        "SHAKESPEARE_TXT = '/content/shakespeare.txt'\n",
        "\n",
        "def transform(txt):\n",
        "  return np.asarray([ord(c) for c in txt if ord(c) < 255], dtype=np.int32)\n",
        "\n",
        "def input_fn(seq_len=100, batch_size=1024):\n",
        "  \"\"\"Return a dataset of source and target sequences for training.\"\"\"\n",
        "  with tf.io.gfile.GFile(SHAKESPEARE_TXT, 'r') as f:\n",
        "    txt = f.read()\n",
        "\n",
        "  source = tf.constant(transform(txt), dtype=tf.int32)\n",
        "\n",
        "  ds = tf.data.Dataset.from_tensor_slices(source).batch(seq_len+1, drop_remainder=True)\n",
        "\n",
        "  def split_input_target(chunk):\n",
        "    input_text = chunk[:-1]\n",
        "    target_text = chunk[1:]\n",
        "    return input_text, target_text\n",
        "\n",
        "  BUFFER_SIZE = 10000\n",
        "  ds = ds.map(split_input_target).shuffle(BUFFER_SIZE).batch(batch_size, drop_remainder=True)\n",
        "\n",
        "  return ds.repeat()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "Bbb05dNynDrQ"
      },
      "source": [
        "### Build the model\n",
        "\n",
        "The model is defined as a two-layer, forward-LSTM, the same model should work both on CPU and TPU.\n",
        "\n",
        "Because our vocabulary size is 256, the input dimension to the Embedding layer is 256.\n",
        "\n",
        "When specifying the arguments to the LSTM, it is important to note how the stateful argument is used. When training we will make sure that `stateful=False` because we do want to reset the state of our model between batches, but when sampling (computing predictions) from a trained model, we want `stateful=True` so that the model can retain information across the current batch and generate more interesting text."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "yLEM-fLJlEEt"
      },
      "outputs": [],
      "source": [
        "EMBEDDING_DIM = 512\n",
        "\n",
        "def lstm_model(seq_len=100, batch_size=None, stateful=True):\n",
        "  \"\"\"Language model: predict the next word given the current word.\"\"\"\n",
        "  source = tf.keras.Input(\n",
        "      name='seed', shape=(seq_len,), batch_size=batch_size, dtype=tf.int32)\n",
        "\n",
        "  embedding = tf.keras.layers.Embedding(input_dim=256, output_dim=EMBEDDING_DIM)(source)\n",
        "  lstm_1 = tf.keras.layers.LSTM(EMBEDDING_DIM, stateful=stateful, return_sequences=True)(embedding)\n",
        "  lstm_2 = tf.keras.layers.LSTM(EMBEDDING_DIM, stateful=stateful, return_sequences=True)(lstm_1)\n",
        "  predicted_char = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(256, activation='softmax'))(lstm_2)\n",
        "  return tf.keras.Model(inputs=[source], outputs=[predicted_char])"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "VzBYDJI0_Tfm"
      },
      "source": [
        "### Train the model\n",
        "\n",
        "First, we need to create a distribution strategy that can use the TPU. In this case it is TPUStrategy. You can create and compile the model inside its scope. Once that is done, future calls to the standard Keras methods `fit`, `evaluate` and `predict` use the TPU.\n",
        "\n",
        "Again note that we train with `stateful=False` because while training, we only care about one batch at a time."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "ExQ922tfzSGA"
      },
      "outputs": [],
      "source": [
        "tf.keras.backend.clear_session()\n",
        "\n",
        "resolver = tf.contrib.cluster_resolver.TPUClusterResolver(TPU_WORKER)\n",
        "tf.contrib.distribute.initialize_tpu_system(resolver)\n",
        "strategy = tf.contrib.distribute.TPUStrategy(resolver)\n",
        "\n",
        "with strategy.scope():\n",
        "  training_model = lstm_model(seq_len=100, stateful=False)\n",
        "  training_model.compile(\n",
        "      optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.01),\n",
        "      loss='sparse_categorical_crossentropy',\n",
        "      metrics=['sparse_categorical_accuracy'])\n",
        "\n",
        "training_model.fit(\n",
        "    input_fn(),\n",
        "    steps_per_epoch=100,\n",
        "    epochs=10\n",
        ")\n",
        "training_model.save_weights('/tmp/bard.h5', overwrite=True)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "TCBtcpZkykSf"
      },
      "source": [
        "### Make predictions with the model\n",
        "\n",
        "Use the trained model to make predictions and generate your own Shakespeare-esque play.\n",
        "Start the model off with a *seed* sentence, then generate 250 characters from it. The model makes five predictions from the initial seed.\n",
        "\n",
        "The predictions are done on the CPU so the batch size (5) in this case does not have to be divisible by 8.\n",
        "\n",
        "Note that when we are doing predictions or, to be more precise, text generation, we set `stateful=True` so that the model's state is kept between batches. If stateful is false, the model state is reset between each batch, and the model will only be able to use the information from the current batch (a single character) to make a prediction.\n",
        "\n",
        "The output of the model is a set of probabilities for the next character (given the input so far). To build a paragraph, we predict one character at a time and sample a character (based on the probabilities provided by the model). For example, if the input character is \"o\" and the output probabilities are \"p\" (0.65), \"t\" (0.30), others characters (0.05), then we allow our model to generate text other than just \"Ophelia\" and \"Othello.\""
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 0,
      "metadata": {
        "colab": {},
        "colab_type": "code",
        "id": "tU7M-EGGxR3E"
      },
      "outputs": [],
      "source": [
        "BATCH_SIZE = 5\n",
        "PREDICT_LEN = 250\n",
        "\n",
        "# Keras requires the batch size be specified ahead of time for stateful models.\n",
        "# We use a sequence length of 1, as we will be feeding in one character at a \n",
        "# time and predicting the next character.\n",
        "prediction_model = lstm_model(seq_len=1, batch_size=BATCH_SIZE, stateful=True)\n",
        "prediction_model.load_weights('/tmp/bard.h5')\n",
        "\n",
        "# We seed the model with our initial string, copied BATCH_SIZE times\n",
        "\n",
        "seed_txt = 'Looks it not like the king?  Verily, we must go! '\n",
        "seed = transform(seed_txt)\n",
        "seed = np.repeat(np.expand_dims(seed, 0), BATCH_SIZE, axis=0)\n",
        "\n",
        "# First, run the seed forward to prime the state of the model.\n",
        "prediction_model.reset_states()\n",
        "for i in range(len(seed_txt) - 1):\n",
        "  prediction_model.predict(seed[:, i:i + 1])\n",
        "\n",
        "# Now we can accumulate predictions!\n",
        "predictions = [seed[:, -1:]]\n",
        "for i in range(PREDICT_LEN):\n",
        "  last_word = predictions[-1]\n",
        "  next_probits = prediction_model.predict(last_word)[:, 0, :]\n",
        "  \n",
        "  # sample from our output distribution\n",
        "  next_idx = [\n",
        "      np.random.choice(256, p=next_probits[i])\n",
        "      for i in range(BATCH_SIZE)\n",
        "  ]\n",
        "  predictions.append(np.asarray(next_idx, dtype=np.int32))\n",
        "  \n",
        "\n",
        "for i in range(BATCH_SIZE):\n",
        "  print('PREDICTION %d\\n\\n' % i)\n",
        "  p = [predictions[j][i] for j in range(PREDICT_LEN)]\n",
        "  generated = ''.join([chr(c) for c in p])  # Convert back to text\n",
        "  print(generated)\n",
        "  print()\n",
        "  assert len(generated) == PREDICT_LEN, 'Generated text too short'"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text",
        "id": "2a5cGsSTEBQD"
      },
      "source": [
        "## What's next\n",
        "\n",
        "* Learn about [Cloud TPUs](https://cloud.google.com/tpu/docs) that Google designed and optimized specifically to speed up and scale up ML workloads for training and inference and to enable ML engineers and researchers to iterate more quickly.\n",
        "* Explore the range of [Cloud TPU tutorials and Colabs](https://cloud.google.com/tpu/docs/tutorials) to find other examples that can be used when implementing your ML project.\n",
        "\n",
        "On Google Cloud Platform, in addition to GPUs and TPUs available on pre-configured [deep learning VMs](https://cloud.google.com/deep-learning-vm/),  you will find [AutoML](https://cloud.google.com/automl/)*(beta)* for training custom models without writing code and [Cloud ML Engine](https://cloud.google.com/ml-engine/docs/) which will allows you to run parallel trainings and hyperparameter tuning of your custom models on powerful distributed hardware.\n"
      ]
    }
  ],
  "metadata": {
    "accelerator": "TPU",
    "colab": {
      "collapsed_sections": [
        "N6ZDpd9XzFeN"
      ],
      "name": "Predict Shakespeare with Cloud TPUs and Keras",
      "provenance": [],
      "version": "0.3.2"
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
